Two-phase importance sampling for inference about transmission trees
Abstract
There has been growing interest in the statistics community in developing methods for inferring transmission pathways of infectious pathogens from molecular sequence data. For many datasets, the computational challenge lies in the huge dimension of the missing data. Here, we introduce an importance sampling scheme in which the transmission trees and the phylogenies of pathogens are both sampled from reasonable importance distributions, which eases the inference. With this approach, arbitrary models of transmission can be considered, in contrast to many previously proposed methods. We illustrate the scheme by analysing transmissions of Streptococcus pneumoniae from household to household within a refugee camp, using data in which only a fraction of hosts is observed, but which is still rich enough to unravel the within-household transmission dynamics and to identify pairs of households between which transmission is plausible. We observe that, although the probability of direct transmission is low even for the most prominent cases of transmission, those pairs of households are geographically much closer to each other than would be expected if proximity were random.
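As a rough illustration of the general idea of importance sampling mentioned in the abstract, the sketch below estimates an expectation under a target distribution using weighted draws from a different proposal distribution. It is a generic self-normalized importance sampler on a toy one-dimensional problem, not the two-phase scheme over transmission trees and phylogenies that the paper describes. The function name `self_normalized_is`, its parameters, and the Gaussian target and proposal are all assumptions made for this example.

```python
import math
import random


def self_normalized_is(log_target, sample_proposal, log_proposal, f, n, rng):
    """Estimate E_target[f(X)] from n proposal draws (hypothetical helper).

    log_target may be unnormalized; the weights are normalized by their sum.
    Returns the estimate and the effective sample size of the weights.
    """
    draws = [sample_proposal(rng) for _ in range(n)]
    # Log importance weights: log target density minus log proposal density.
    log_w = [log_target(x) - log_proposal(x) for x in draws]
    # Subtract the maximum before exponentiating for numerical stability.
    m = max(log_w)
    w = [math.exp(lw - m) for lw in log_w]
    total = sum(w)
    estimate = sum(wi * f(x) for wi, x in zip(w, draws)) / total
    # Effective sample size: (sum of weights)^2 / sum of squared weights.
    ess = total ** 2 / sum(wi * wi for wi in w)
    return estimate, ess


# Toy setting (assumed): target is N(1, 1) known only up to a constant,
# proposal is the wider N(0, 2^2).
def log_target(x):
    return -0.5 * (x - 1.0) ** 2


def sample_proposal(rng):
    return rng.gauss(0.0, 2.0)


def log_proposal(x):
    return -0.5 * (x / 2.0) ** 2 - math.log(2.0) - 0.5 * math.log(2.0 * math.pi)


rng = random.Random(0)
estimate, ess = self_normalized_is(
    log_target, sample_proposal, log_proposal, lambda x: x, 20000, rng
)
print(f"estimated mean: {estimate:.3f}, effective sample size: {ess:.0f}")
```

The effective sample size gives a quick check on how well the proposal covers the target: a value far below the number of draws suggests that a few weights dominate and the estimate is unreliable.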
Similar resources
Postprocessing of genealogical trees.
We consider inference for demographic models and parameters based upon postprocessing the output of an MCMC method that generates samples of genealogical trees (from the posterior distribution for a specific prior distribution of the genealogy). This approach has the advantage of taking account of the uncertainty in the inference for the tree when making inferences about the demographic model a...
Inference of Transmission Network Structure from HIV Phylogenetic Trees
Phylogenetic inference is an attractive means to reconstruct transmission histories and epidemics. However, there is not a perfect correspondence between transmission history and virus phylogeny. Both node height and topological differences may occur, depending on the interaction between within-host evolutionary dynamics and between-host transmission patterns. To investigate these interactions,...
Dynamic importance sampling in Bayesian networks using factorisation of probability trees
Factorisation of probability trees is a useful tool for inference in Bayesian networks. Probabilistic potentials some of whose parts are proportional can be decomposed as a product of smaller trees. Some algorithms, like lazy propagation, can take advantage of this fact. Also, the factorisation can be used as a tool for approximating inference, if the decomposition is carried out even if the pr...
Simultaneous inference of phylogenetic and transmission trees in infectious disease outbreaks
Whole-genome sequencing of pathogens from host samples becomes more and more routine during infectious disease outbreaks. These data provide information on possible transmission events which can be used for further epidemiologic analyses, such as identification of risk factors for infectivity and transmission. However, the relationship between transmission events and sequence data is obscured b...
2nd Cornell Probability Summer School 2006
A diffusion process model of the frequency of a mutation • Reversibility of a 1-dimensional diffusion process • Frequency spectrum and age of a mutation General binary coalescent trees • Combinatorial derivation of the age of a mutation • Ewens' sampling formula, a combinatorial derivation • Coalescent lineage distributions Gene trees and Coalescent trees • DNA sequences and the infinitely-many...